Abstract: The smart power grid aims at harnessing information 
and communication technologies to enhance reliability and enforce 
sensible use of energy. Its realization is geared by the fundamental 
goal of effective management of demand load. In this work, we 
envision a scenario with real-time communication between the 
operator and consumers. The grid operator controller receives 
requests for power demands from consumers, each with different 
power requirement, duration, and a deadline by which it is to 
be completed. The objective of the operator is to devise a power 
demand task scheduling policy that minimizes the grid operational 
cost over a time horizon. The operational cost is a convex function 
of instantaneous total power consumption and reflects the fact that 
each additional unit of power needed to serve demands is more 
expensive as the demand load increases. 

First, we study the off-line demand scheduling problem, where 
parameters are fixed and known a priori. If demands may be 
scheduled preemptively, the problem is a load balancing one, 
and we present an iterative algorithm that optimally solves it. If 
demands need to be scheduled non-preemptively, the problem is 
a bin packing one. Next, we devise a stochastic model for the case 
when demands are generated continually and scheduling decisions 
are taken online, and we focus on the long-term average cost. We present 
two instances of power consumption control based on observing 
current consumption. In the first one, the controller may choose 
to serve a new demand request upon arrival or to postpone it to 
the end of its deadline. The second one has the additional option 
to activate one of the postponed demands when an active demand 
terminates. For both instances, the optimal policies are threshold- 
based. We derive a lower performance bound over all policies, 
which is asymptotically tight as deadlines increase. We propose the 
Controlled Release threshold policy and prove it is asymptotically 
optimal. The policy activates a new demand request if the current 
power consumption is less than a threshold, otherwise it is queued. 
Queued demands are scheduled when their deadline expires or 
when the consumption drops below the threshold. 

I. Introduction 

The smart power grid is currently considered a major chal- 
lenge for harnessing information and communication tech- 
nologies to enhance the electric grid flexibility and reliabil- 
ity, enforce sensible use of energy and enable embedding 
of different types of grid resources to the system. These 
resources include renewable ones, distributed micro-generator 
customer entities, electric storage, and plug-in electric vehicles 
[1]. The smart power grid shall incorporate new technologies 
that currently experience rapid progress, such as advanced 
metering, automation, bi-directional communication, and distributed 
power generation and storage. The ultimate interconnection 
and real-time communication between the consumer and the 
market/system operator premises will be realized through 
IP-addressable components over the Internet [2]. 

The design and realization of the smart power grid is geared 
by the fundamental goal of effective management of power sup- 
ply and demand loads. Load management is primarily employed 
by the power utility system operator with the objective to match 
the power supply and demand profiles in the system. Since the 
supply profile shaping depends highly on demand profile, the 
latter constitutes the primary substrate at which control should 
be exercised by the operator. The basic objective therein is 
to alleviate peak load by shifting non-emergency power 
demands to off-peak time intervals. 

Demand load management does not significantly reduce total 
energy consumption since most of the curtailed demand jobs are 
transferred from peak to off-peak time intervals. Nevertheless, 
load management aids in smoothing the power demand profile 
of the system across time by avoiding power overload periods. 
By continuously striving to maintain the total demand to be 
satisfied below a critical load, grid reliability is increased as 
grid instabilities caused by voltage fluctuations are reduced. 
Further, the possibility of power outage due to sudden increase 
of demand or contingent malfunction of some part of the system 
is decreased. More importantly, demand load management 
reduces or eliminates the need for inducing supplementary 
generated power into the grid to satisfy increased demand 
during peak hours. This supplementary power is usually much 
more costly to provide for the operator than the power for 
average base consumed load, since it emanates from gas micro- 
turbines or is imported from other countries at a high price. 
Thus, from the point of view of system operator, effective 
demand load management reduces the cost of operating the 
grid, while from the point of view of the user, it lowers real- 
time electricity prices. 

In this paper, we make a first attempt to formulate and 
solve the basic control and optimization problem faced by 
the power grid operator so as to achieve the goals above. 
We envision a scenario with real-time communication between 
the operator and consumers through IP addressable smart 
metering devices installed at the consumer and operator sides. 



The grid operator has full control over consumer appliances. 
The operator controller receives power demand requests from 
different consumers, each with different power requirements, 
different duration (which sometimes may be even unknown), 
and different flexibility in its satisfaction. Flexibility is modeled 
as a deadline by which each demand needs to be completed. 
The objective of the grid operator is to devise a power demand 
task scheduling policy that minimizes the grid operational cost 
over a time horizon. The operational cost is modeled as a 
convex function of instantaneous total power consumption in 
the system, so as to reflect the fact that each additional Watt 
of power needed to serve power demands is more expensive as 
the total power demand increases. 

A. State-of-the-art 

In power engineering terminology, the power demand 
management method above is known as demand response 
support [3]. Demand response is currently realized mostly through 
static contracts that offer consumers lower prices for the power 
consumed at off-peak hours, and it relies on voluntary customer 
participation. A recent development involves real-time 
pricing, but it still requires appliances to be turned off manually. 
Currently, there is significant research activity in automating 
the process of demand response by developing appropriate 
enabling technologies that reduce power consumption at times 
of peak demand [4]. GridWise is an important research 
initiative in the USA with this goal. 

In one form, the automation process may involve regulation 
of power consumption level of consumer appliances like heaters 
or air conditioners (A/C) by the operator, or slight delaying of 
consumption until the peak demand is reduced. For instance, 
in the Toronto PeakSaver AC pilot program [6], the operator 
can automatically control A/Cs during peak demand through an 
installed switch at the central A/C unit, thus in essence shifting 
portions of power consumption in time. Lockheed Martin has 
developed the SeeLoad™ system to realize efficient demand 
response in real-time. Other efforts like the EnviroGrid™ by 
REGEN Energy Inc. are based on self-regulation of energy 
consumption of appliances within the same facility without 
intervention of the operator, through controllers connected in a 
ZigBee wireless network [8]. In an automated dynamic pricing 
and appliance response scenario, the work [5] addresses the 
decision problem faced by home appliances of when to request 
instantaneous power price from the grid so as to perform 
power consumption adaptation. The problem is modeled as a 
Markov Decision Process subject to a cost of obtaining the 
price information. 

At the level of modeling abstraction, the problem of smoothing 
power demand bears some relation to that of scheduling 
tasks under deadline constraints in order to optimize total 
cost over a time horizon. The operations research literature 
contains much work on machine scheduling under deadline 
constraints, mainly for optimizing linear functions of 
the load [10, Chap. 21-22]. For wire-line networks, the Earliest 
Deadline First (EDF) scheduling rule is optimal in the sense of 
minimizing packet loss due to deadline expirations [11]. 



Scheduling under deadlines with convex cost models has recently 
gained momentum in wireless networks because the expended 
transmission energy is convex in throughput. In [12], the 
authors solve the deterministic scheduling problem with a priori 
known packet arrival times under certain deadlines so as to 
minimize the total consumed energy if only one packet can 
be transmitted at a time. The work in [13] studies properties 
of the optimal off-line solution for the same problem and 
proposes heuristic online scheduling algorithms. In [14], the 
authors consider the problem of minimizing the energy needed 
to transmit a certain amount of data within a given time 
interval over a time-varying wireless link. Energy is a convex 
function of the amount of data sent each time. The non-causal 
problem, where link quality is known a priori, is solved by convex 
optimization. The online problem, where the link quality at each 
time is revealed to the controller just before the decision, is solved by 
dynamic programming. The optimal policy is of threshold type 
on the energy cost of sending data immediately versus saving 
it for later transmission. The multi-user version of the problem 
is studied in [15]. In [16], the problem of transmit rate control 
for minimizing energy over a finite horizon is solved through 
continuous time optimization, and optimal transmission policies 
are derived in terms of continuous functions of time that satisfy 
certain quality of service curves. Finally, the works [17], [18] 
present a long-term view on probabilistic latency guarantees 
per packet in wireless networks based on a primal-dual-like 
algorithm for a utility maximization problem with a constraint 
on latency guarantees. 

B. Our contribution 

In this paper, we address the problem of optimal power 
demand scheduling subject to deadlines in order to minimize 
the cost over a time horizon. The problem is faced by a grid 
operator that has full control over the consumer appliances. To 
the best of our knowledge, this is the first work that attempts 
to characterize structural properties of the problem and the 
solutions in the context of smart grid power demand load 
management. The contribution of our work to the literature is 
as follows: 

• We formulate the off-line version of the demand scheduling 
problem for a certain time horizon, where the demand 
task generation pattern, duration, power requirement and 
deadline for each task are fixed and given a priori. We 
distinguish between elastic and inelastic demands that give 
rise to preemptive and non-preemptive task scheduling 
respectively. In the first case, the problem is a load 
balancing one, and we present an iterative algorithm that 
optimally solves it. In the second case, the problem is 
equivalent to bin packing, and thus it is NP-Hard. 

• We study the online dynamic scheduling problem. We 
propose a stochastic model for the case when demands are 
generated continually and scheduling decisions are taken 
online, and we consider minimizing the long-term average 
cost. First, we derive the performance of the simplest 
default policy which is to schedule each task upon arrival. 

Fig. 1. Power demand task related parameters. Power demand task $n$, 
$n = 1, 2, 3$, is generated at time $a_n$, has duration $s_n$ and power 
requirement $p_n$, and needs to be completed by $d_n$. 

Fig. 2. Overview of system architecture. The smart grid-enabled appliances 
send power demand requests to the smart consumer device, which further 
dispatches them to the controller at the operator side. The controller returns 
a schedule for each task which is passed to the appliances through the smart 
consumer device. 



Next, we present two instances of power consumption 
control based on observing current power consumption. 
In the first one, the controller may choose to serve a new 
demand request upon arrival or to postpone it until the 
end of its deadline. The second one is more enhanced and 
has the additional option to activate one of the postponed 
demands when a demand under service terminates. For 
both instances above, the optimal policies are threshold- 
based. 

• We derive a lower performance bound over all policies, 
which is asymptotically tight as deadlines increase by 
showing a sequence of policies that achieves the bound. 
We propose the Controlled Release threshold policy and 
prove it is asymptotically optimal in that it achieves the 
bound specified above. The policy activates a new demand 
request if the current power consumption is less than 
a threshold, otherwise it is queued. Queued demands 
are scheduled when their deadline expires or when the 
consumption drops below the threshold. 

The paper is organized as follows. In Section II we present the 
model and assumptions, and in Section III we study the off-line 
version of the problem. Section IV contains the study of the 
online version of the problem, the derived lower bound and the 
optimal policies, and Section V concludes our study. 

II. The Model 

We consider a controller located at the electric utility operator 
premises, with bi-directional communication to some smart 
devices each of which is located at a consumer's premises. 
Each smart device at a consumer side is connected to smart grid 
enabled appliances. The smart device collects power demand 
requests from individual appliances. These requests can be 
either manually entered by the user at the times of interest or 
they can be generated based on some automated process. Each 
power demand task $n$, $n = 1, 2, \dots$, has a time of generation 
$a_n$, a time duration of $s_n$ time units, and an instantaneous power 
requirement $p_n$ (in Watts) when the corresponding task is 
activated and consumes power. Each task is characterized by 
some temporal flexibility or delay tolerance in being activated, 
which is captured by a deadline $d_n > a_n$ by which it needs to 
be completed. For example, some appliances (e.g. lights) have 
zero delay tolerance, while others (e.g. washing machines) have 
some delay tolerance. Figure 1 depicts the parameters defined 
above for three tasks. 

We assume that all demand tasks shall be eventually scheduled, 
at the latest by their deadlines. In other words, there are 
no demand task losses in the system. A task may be scheduled 
to take place non-preemptively or preemptively. In the first case, 
once it starts, a task $n$ is active for $s_n$ consecutive time units 
until completion. Thus, each task is scheduled at some time 
$t_n \in [a_n, d_n - s_n]$, or in other words it is scheduled with a time 
shift $\tau_n \in [0, D_n]$ after its arrival, where $D_n = d_n - s_n - a_n$. 
In the case of preemptive scheduling, each task $n$ may be 
scheduled with interruptions within the prescribed tolerance 
interval as long as it is finished on time. We assume that 
the instantaneous power consumption $p_n$ of a task cannot be 
adapted by the controller. Nevertheless, the possibility of having 
$p_n$ adaptable by the operator controller could be incorporated 
in our formulation. 
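
As a small illustration of the task parameters above, the following sketch stores one task and computes its slack $D_n = d_n - s_n - a_n$. The class and field names are hypothetical and chosen only for this example; the numbers are made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DemandTask:
    """One power demand task; names are illustrative, not from the paper."""
    arrival: float   # a_n, time of generation
    duration: float  # s_n, time units the task must be active
    power: float     # p_n, instantaneous power while active (Watts)
    deadline: float  # d_n, time by which the task must be completed

    def max_shift(self) -> float:
        """Slack D_n = d_n - s_n - a_n available for delaying the start."""
        return self.deadline - self.duration - self.arrival

    def is_feasible_start(self, start: float) -> bool:
        """True if a non-preemptive start time lies in [a_n, d_n - s_n]."""
        return self.arrival <= start <= self.deadline - self.duration


# Made-up example: a task arriving at t = 2 that runs 1.5 time units
# and must finish by t = 6.
washer = DemandTask(arrival=2.0, duration=1.5, power=500.0, deadline=6.0)
slack = washer.max_shift()
```

A task with zero slack, such as lighting in the example of the text, would have `max_shift()` equal to zero and a single feasible start time.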

The controller receives power demand requests from smart 
devices and it needs to decide on the time that the different 
power demand tasks are activated. Then, it sends the 
corresponding command for activation to the smart device 
from which the task emanated. The smart device transfers 
the command to the corresponding appliance, and the power 
demand is activated at the time prescribed by the operator 
controller (Fig. 2). The communications from the controller to 
the smart devices and from them to the appliances take place 
through a high-speed connection and thus incur zero delay. 
We assume that the grid operator has full control over the 
individual consumer appliances and that the appliances comply 
with the dictated schedule and start the task at the prescribed time. 

In this work, we consider two versions of the problem: 

1) An off-line one for cost minimization over a time hori- 
zon, where the power demand generation times, dura- 
tions, power requirements and deadlines are known non- 
causally to the controller. This is valid for cases where 
off-line scheduling can be used. Under those non-realistic 
assumptions we also obtain performance bounds. 

Fig. 3. A piecewise linear convex cost function $C(P)$ of instantaneous power 
consumption $P$ with $L = 3$ power consumption classes, defined by lines 
$k_i P + h_i$, $i = 1, 2, 3$. Two cross-over points $P_1$, $P_2$ distinguish the 
different classes. 



2) An online one for long-term average cost minimization, 
where quantities are stochastic. This fluid model captures 
the case where demands are generated continually and 
scheduling decisions are taken online. 

A. Cost Model 

At each time $t$, let $P(t)$ denote the total instantaneous 
consumed power in the system. This is the summation over all 
active tasks, i.e. tasks that consume power at time $t$. We denote 
the instantaneous cost associated with power consumption $P(t)$ at 
time $t$ as $C(P(t))$, where $C(\cdot)$ is an increasing, differentiable 
convex function. Convexity of $C(\cdot)$ reflects the fact that the 
differential cost of power consumption for the electric utility 
operator increases as the demand increases. That is, each unit of 
additional power needed to satisfy increasing demand becomes 
more expensive to obtain and make available to the consumer. 
For instance, supplementary power for serving periods of high 
demands may be generated from expensive means, or it may 
be imported at high prices from other countries. In its simplest 
form, the cost may be a piecewise linear function of the form: 

$$C(x) = \max_{i=1,\dots,L} \{ k_i x + h_i \} \qquad (1)$$

with $k_1 < \dots < k_L$, accounting for $L$ different classes of power 
consumption, where each additional Watt consumed costs more 
when at class $\ell$ than at class $(\ell - 1)$, $\ell = 2, \dots, L$ (Fig. 3). In 
our model, we shall consider a generic convex function $C(\cdot)$. 
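
The piecewise linear cost in (1) can be evaluated as the maximum over its defining lines. The sketch below is only an illustration: the function name and the slope and intercept values are assumptions made for this example, chosen so that the cross-over points fall at 10 and 20.

```python
def piecewise_linear_cost(x, slopes, intercepts):
    """Evaluate C(x) = max_i (k_i * x + h_i) for a piecewise linear convex cost."""
    return max(k * x + h for k, h in zip(slopes, intercepts))


# Hypothetical L = 3 classes with increasing slopes k_1 < k_2 < k_3.
SLOPES = (1.0, 2.0, 4.0)
INTERCEPTS = (0.0, -10.0, -50.0)

cost_low = piecewise_linear_cost(5.0, SLOPES, INTERCEPTS)    # first class
cost_mid = piecewise_linear_cost(15.0, SLOPES, INTERCEPTS)   # second class
cost_high = piecewise_linear_cost(30.0, SLOPES, INTERCEPTS)  # third class
```

Because the slopes increase, the maximum automatically selects the line of the active class, and the resulting function is convex.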

III. The Off-line Demand Scheduling Problem 

First, we consider the off-line version of the demand 
scheduling problem for $N$ power demand tasks. For each task 
$n = 1, \dots, N$, the generation time $a_n$, power consumption 
$p_n$, duration $s_n$ and deadline $d_n$ are deterministic quantities 
which are non-causally known to the controller before time 
$t = 0$. This version of the problem may arise if task properties 
are completely predictable (for instance, if tasks exhibit 
time periodicity) and in any case provides useful performance 
bounds. We fix attention on a finite horizon $T$. 



1) Preemptive scheduling of tasks: Consider first the case 
of elastic demands, which implies that each demand task 
$n$ may get preemptive service, i.e. it does not need to be 
served contiguously. Namely, each task may be interrupted and 
continued later such that it is active at nonconsecutive time 
intervals, provided of course that it will be completed by its 
specified time $d_n$. Each task $n$ has fixed power requirement $p_n$ 
when it is active. 

For each task $n$ and time $t \in [0, T]$, define the function $x_n(t)$, 
which is 1 if job $n$ is active at time $t$, and 0 
otherwise. A scheduling policy is a collection of functions 
$X = \{x_1(t), \dots, x_N(t)\}$, defined on the interval $[0, T]$. The controller 
needs to find the scheduling policy that minimizes the total 
cost in horizon $[0, T]$, where at each time $t$, the instantaneous 
cost is a convex function of the total instantaneous power load. The 
optimization problem faced by the controller is: 

$$\min_{X} \int_0^T C\Big( \sum_{n=1}^{N} p_n x_n(t) \Big)\, dt \qquad (2)$$

subject to: 

$$\int_{a_n}^{d_n} x_n(t)\, dt = s_n \qquad (3)$$

and $x_n(t) \in \{0, 1\}$ for all $n = 1, \dots, N$ and $t \in [0, T]$. The 
constraint implies that each task should be completed by its 
respective deadline. 

The problem above is combinatorial in nature due to the binary-valued 
functions $x_n(t)$. A lower bound on the optimal cost is 
obtained if we relax $x_n(t)$ to be continuous-valued functions, 
so that $0 \le x_n(t) \le 1$. This relaxation allows us to capture the 
scenario of varying instantaneous power level for each task $n$; 
at time $t$, $p_n x_n(t)$ denotes the instantaneous power consumed 
by demand task $n$. For each $n = 1, \dots, N$, define the set of 
functions that satisfy feasibility condition (3), 

$$\mathcal{F}_n = \Big\{ x_n(t) : \int_{a_n}^{d_n} x_n(t)\, dt = s_n \Big\} \qquad (4)$$

with $0 \le x_n(t) \le 1$ for all $t \in [0, T]$. 

The following fluid model captures the continuous-valued 
problem. Consider the following bipartite graph $U \cup V$. There 
exist $|U| = N$ nodes on one side of the graph, one node 
for each task. Also, there exist $|V|$ nodes, where each node $k$ 
corresponds to the infinitesimal time interval $[(k-1)\,dt, k\,dt]$ 
of length $dt$. From each node $n = 1, \dots, |U|$, we draw 
links towards the infinitesimal time intervals that reside in the interval 
$[a_n, d_n]$. Input flow $p_n s_n$ enters each node $n = 1, \dots, |U|$. 
Let $\ell(t) = \sum_{n=1}^{N} p_n x_n(t)$ denote the power load at time $t$, 
$0 \le t \le T$. 

The problem belongs to the class of problems that involve 
the sum (here, integral) of convex costs of loads at different 
locations (here, infinitesimal time intervals), 

$$\min \int_0^T C(\ell(t))\, dt, \qquad (5)$$

and for which the solution is load balancing across different 
locations [19]. 

For a given load function $\ell(t)$, define the operator $\mathcal{T}_n$ on $\ell(t)$ 
as: 

$$\mathcal{T}_n \ell(t) = \arg\min_{x_n(t)} \int_0^T C(\ell(t))\, dt, \qquad (6)$$

where the minimization is over the functions $x_n(t)$ in the feasible 
set defined in (4). 

Now define a sequence of demand task indices $\{i_k\}_{k \ge 1}$ 
in which tasks are parsed. One such sequence is 
$\{1, \dots, N, 1, \dots, N, \dots\}$, where tasks are parsed one after the 
other according to their index in successive rounds. Consider 
the sequence of power load functions $\ell^{(k+1)}(t) = \mathcal{T}_{i_k} \ell^{(k)}(t)$, 
for $k = 1, 2, \dots$. For example, if $i_k = n$, the problem 

$$\min \int_0^T C\Big( p_n x_n(t) + \sum_{k \ne n} p_k x_k(t) \Big)\, dt \qquad (7)$$

with $0 \le x_n(t) \le 1$, $0 \le t \le T$, is solved in terms of $x_n(t)$, 
while the other functions $x_k(t)$, $k \ne n$, are kept unchanged. This is 
a convex optimization problem, for which the KKT conditions 
yield the solution function $x_n(t)$. Essentially this is the function 
that balances power load across times $t \in [0, T]$ as much as 
possible at that iteration. 
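
The iteration above can be sketched in discrete time. The code below is an illustration under stated assumptions, not the authors' implementation: time is divided into unit slots, the cost is assumed strictly convex and differentiable so that each single-task step reduces to water-filling (the load is levelled to a common value wherever $0 < x_n(t) < 1$), and all function names and the example tasks are hypothetical.

```python
def water_fill(other, window, p, s, iters=60):
    """Relaxed schedule x(t) in [0, 1] on `window` with sum of x equal to s.

    Levels other[t] + p * x[t] to a common value mu wherever 0 < x[t] < 1,
    which is the KKT solution of the single-task step for a strictly
    convex cost (assumption of this sketch).
    """
    lo = min(other[t] for t in window)       # at this level nothing is allocated
    hi = max(other[t] for t in window) + p   # at this level every slot is full

    def alloc(mu):
        return {t: min(1.0, max(0.0, (mu - other[t]) / p)) for t in window}

    for _ in range(iters):                   # bisection on the water level mu
        mu = (lo + hi) / 2.0
        if sum(alloc(mu).values()) < s:
            lo = mu
        else:
            hi = mu
    return alloc(hi)


def balance_load(tasks, horizon, rounds=20):
    """tasks: list of (a, d, s, p) with integer slots a <= t < d and s <= d - a."""
    # Feasible starting point: spread each task evenly over its window.
    x = [{t: s / (d - a) for t in range(a, d)} for (a, d, s, p) in tasks]

    def load(skip=None):
        return [sum(task[3] * x[m].get(t, 0.0)
                    for m, task in enumerate(tasks) if m != skip)
                for t in range(horizon)]

    for _ in range(rounds):
        for n, (a, d, s, p) in enumerate(tasks):  # round-robin order 1, ..., N
            x[n] = water_fill(load(skip=n), range(a, d), p, s)
    return x, load()


# Hypothetical instance: task 0 must fill slots 0-1; task 1 has slack.
tasks = [(0, 2, 2, 1.0), (0, 4, 2, 1.0)]
schedule, total_load = balance_load(tasks, horizon=4)
```

In this small instance the second task is pushed into the slots that the first task cannot use, so the total load is flat across the horizon.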

Theorem 1: The iterative load balancing algorithm that generates 
the sequence of power load functions $\ell^{(k+1)}(t) = \mathcal{T}_{i_k} \ell^{(k)}(t)$, 
for $k = 1, 2, \dots$, where the operator $\mathcal{T}$ is defined by 
(6), converges to the optimal solution for the continuous-valued 
problem. 

Proof: In [19, pp. 1403-1404] a proof methodology is 
developed for the case of discrete locations and discrete flow 
vectors. It is straightforward to extend this methodology to the 
instance described here, with the integral in the objective and 
functions $x_n(t)$ instead of the discrete vectors, to show that the 
sequence of power load functions $\ell^{(k)}(t)$, for $k = 1, 2, \dots$, 
converges to the optimal solution $X^*$ for the original continuous-valued 
problem and that the final optimal set of functions 
$X^*$ minimizes the maximum power load over all times $t$. The 
corresponding problem with binary-valued functions $\{x_n(t)\}$ 
has similar properties, as discussed in [19]. ■ 

2) Non-preemptive scheduling of tasks: Now, we consider 
the case of inelastic demands. Namely, we assume that, once 
scheduled to start, a task should be served uninterruptedly until 
completion. A discrete-time consideration is better suited to 
capture this case. Consider the following instance $I$ of the 
problem. For each task $n = 1, \dots, N$, let the generation time 
$a_n = 0$ and the deadline $d_n = D$, i.e. common for all tasks. 
Also assume that power requirements are the same, i.e. $p_n = p$ 
for all $n$. Fix a positive integer $m$, and consider the following 
decision version of the scheduling problem: Does there exist a 
schedule for the $N$ tasks such that the maximum instantaneous 
consumed power is $mp$? 

Let us view each task $n$ of duration $s_n$ as an item of size 
$s_n$, and the horizon $T = D$ as a bin of capacity $D$. Then, 
the question above can be readily seen to be equivalent to the 
decision version of the one-dimensional bin packing problem: 
"Does there exist a partition of the set of $N$ items into $m$ 
disjoint subsets (bins) $U_1, \dots, U_m$, such that the sum of the 
sizes of items in each subset (bin) is $D$ or less?" Clearly, 
each bin is one level of step $p$ of power consumption. If $m$ 
bins suffice to accommodate the $N$ items, then the maximum 
instantaneous power consumption is $mp$, and vice versa. 

The optimization version of the one-dimensional bin packing 
problem is to partition the set of $N$ items into the smallest 
possible number $m$ of disjoint subsets (bins) $U_1, \dots, U_m$ such 
that the sum of the sizes of items in each subset (bin) is 
$D$ or less. This is equivalent to the problem of finding a 
schedule of power demand tasks that minimizes the maximum 
power consumption over the time horizon $T$. Minimizing the 
maximum power consumption in the time horizon of duration 
$T$ was shown to be equivalent to minimizing the total convex 
cost in the horizon. The decision version of bin packing is 
NP-Complete [20], and thus the optimization version of bin 
packing is NP-Hard. It can thus be concluded that finding a 
schedule that minimizes the total convex cost in the horizon is 
an NP-Hard problem. 
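
Since the exact problem is NP-Hard, a heuristic is the practical route for larger instances. The sketch below uses first-fit decreasing, a standard bin packing heuristic that is not part of the paper and is swapped in here purely to illustrate the mapping of tasks (items) to power levels (bins); the function name and the durations are hypothetical, and the result is not guaranteed to be optimal.

```python
def first_fit_decreasing(durations, capacity):
    """Pack task durations into bins of size `capacity` (the horizon D).

    Each bin corresponds to one power level of step p, so the number of
    bins times p is the peak power of the resulting schedule.
    """
    bins = []  # each bin lists the durations stacked on one power level
    for size in sorted(durations, reverse=True):
        if size > capacity:
            raise ValueError("task longer than the horizon D")
        for level in bins:
            if sum(level) + size <= capacity:
                level.append(size)   # first level with enough room
                break
        else:
            bins.append([size])      # open a new power level
    return bins


# Hypothetical instance: five tasks, common window of D = 7 time units.
levels = first_fit_decreasing([4, 3, 3, 2, 2], capacity=7)
unit_power = 1.0                      # p, assumed equal for all tasks
peak_power = len(levels) * unit_power
```

Within one level, tasks are run back to back, so each level contributes $p$ to the instantaneous consumption for at most $D$ time units.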

For different generation times $a_n$ and deadlines $d_n$, one can 
easily create instances that are equivalent to the bin packing 
problem. For different power requirements $p_n$, one way to 
proceed is to show equivalence with bin packing by defining a 
minimum quantum $\Delta_p$ of power requirements and by observing 
that a task with power requirement $p_n = n \Delta_p$ and duration $s_n$ 
is equivalent to $n$ tasks of size $\Delta_p$ and duration $s_n$. 

IV. The Online Dynamic Demand Scheduling Problem 

We now consider the online dynamic version of the scheduling 
problem. This captures the scenario where demands are 
generated continually and scheduling decisions need to be taken 
online as the system evolves. Power demand requests arrive at 
the grid operator controller according to a Poisson process, with 
average rate $\lambda$ requests per unit of time. The time duration 
$s_n$ of each power demand request $n = 1, 2, \dots$ is a random 
variable that is exponentially distributed with parameter $s$, i.e. 
$\Pr(s_n \le x) = 1 - e^{-sx}$, with $x \ge 0$. Equivalently, the mean 
request duration is $1/s$ time units, and $s$ is the average service 
rate for power demand tasks. The durations of different requests 
are independent random variables. 

The deadline $d_n$ of each request $n = 1, 2, \dots$ is also 
exponentially distributed with parameter $d$, i.e. 
$\Pr(d_n \le x) = 1 - e^{-dx}$, with $x \ge 0$. Thus, the mean deadline is $1/d$ time 
units, and $d$ may be viewed as the deadline expiration rate. 
Deadlines of different requests are independent. 

We are interested in minimizing the long-run average cost 

$$\lim_{T \to \infty} \frac{1}{T} E\Big[ \int_0^T C(P(t))\, dt \Big] = E[C(P(t))], \qquad (8)$$

where the expectation on the right-hand side is with respect to the 
stationary distribution of $P(t)$. A remark is in order here about the 
nature of the system state that is assumed to be available to the grid 
operator controller. The controller can measure total instantaneous power 
consumption. This is a readily available type of state and a basic 
one on which control decisions should rely. There also exist 
other evolving parameters that could enhance system state, but 
we refrain from using these for decision making in this paper, 
mainly because our primary objective is to understand the 
structure of simple control policies first before proceeding to 
more composite ones. 

A. Default Policy: No scheduling 

Consider the default, naive policy where each power demand 
is activated by the controller immediately upon its generation, 
namely there is no scheduling regulation of demand tasks. This 
policy is oblivious to instantaneous power consumption P(t) 
and all other system parameters. 

1) Fixed power requirement per task: First, assume that the 
power requirement of each task is fixed and unit, i.e. $p_n = 1$. 
The instantaneous power consumption at time $t$ is $P(t) = N(t)$, 
where $N(t)$ is the number of active demands at time 
$t$. Under the assumptions stated above on the demand arrival 
and service processes, $N(t)$ (and thus $P(t)$) is a continuous-time 
Markov chain. In fact, since each power demand task is 
always activated (served) upon arrival and there is no waiting 
time or loss, we can view $P(t)$ as the occupation process of an 
$M/M/\infty$ service system. From state $P(t)$, there are transitions 
to state: 

• $P(t) + 1$ with rate $\lambda$, when a new demand request arrives. 

• $P(t) - 1$ with rate $P(t)s$, when one of the current $P(t)$ 
active demands is completed. 

Through the steady-state probabilities $q_i = \lim_{t \to \infty} \Pr(P(t) = i)$, 
$i = 0, 1, 2, \dots$, and the equilibrium equations we can obtain the 
steady-state probability distribution of the number of active 
power demand tasks, 

$$q_i = e^{-\lambda/s} \frac{(\lambda/s)^i}{i!}, \qquad (9)$$

which is Poisson distributed with parameter $\lambda/s$. The same 
steady-state distribution emerges for an $M/G/\infty$ queue [21]. 
Thus, the expected number of active requests in steady state 
is simply $E[P(t)] = \lambda/s$, where the expectation is with respect 
to the stationary Poisson distribution of $P(t)$. As a result, the 
total expected cost is 

$$E[C(P(t))] = E[C(N(t))] = \sum_{i=0}^{\infty} q_i C(i). \qquad (10)$$

Given the cost function $C(\cdot)$, we can get the expression for the 
total expected cost. 
2) Variable power requirement per task: The extension to 
different power requirements of tasks is done by reasoning as 
follows. Suppose that the power requirement of each task, $P$, is a 
random variable that obeys a discrete probability distribution on 
values $\{p_1, \dots, p_L\}$ with associated probabilities $w_1, \dots, w_L$ 
(the case of a continuous probability distribution of $P$ is tackled 
similarly). Random variable $P$ is taken to be independent of the 
process $N(t)$. Let $E[P] = \sum_{k=1}^{L} p_k w_k$ be the expected value 
of the power requirement. Power consumption at time $t$ is 
$P(t) = P \cdot N(t)$, and the average power consumption at steady state is 
$E[P(t)] = \lambda E[P]/s$. 

This becomes obvious by the following analogy. For fixed, 
unit power requirements, $p_n = 1$, a demand request that arrives 
in the infinitesimal time interval $[k\Delta, (k+1)\Delta]$ goes to one server 
in the $M/M/\infty$ system and is served; at that interval, the arrival 
rate is $\lambda$. If $p_n = n$, the situation is as if $n$ servers are occupied, 
or equivalently $n$ requests of unit power requirement appear in 
the same interval, and the arrival rate is $n \cdot \lambda$. Thus, an average 
power requirement $E[P]$ is equivalent to an average arrival rate 
$\lambda E[P]$ of requests with unit power requirement. 

The total expected cost is found by taking expectation with 
respect to both the distribution of $N(t)$ and $P$, 

$$E[P \cdot C(N(t))] = \sum_{i=0}^{\infty} \sum_{k=1}^{L} p_k\, C(i)\, q_i\, w_k, \qquad (11)$$

where $q_i$ is the steady-state probability of $i$ active tasks. 

The default policy described above activates each task upon 
arrival without taking into account system state information. 

B. A Universal Lower Bound 

We now derive a lower bound on the performance of any 
scheduling policy in terms of total expected cost. 

Theorem 2: The total expected cost of any scheduling policy is at 
least $C(\lambda E[P]/s)$. 

Proof: We use Jensen's inequality, which says that for a
random variable $X$ and convex function $C(\cdot)$, it is $E[C(X)] \ge
C(E[X])$. For strictly convex $C$, equality holds if and only if
$X = E[X]$, i.e., when random variable $X$ is constant. Jensen's
inequality in our case means

$$E[C(P(t))] \ge C(E[P(t)]) . \qquad (12)$$

We now argue that this lower bound is universal for all scheduling
policies. A scheduling policy essentially shifts arising
power demand tasks in time. These time shifts alter instantaneous
power consumption $P(t)$ and thus they can also change
the steady-state distribution of $P(t)$. However, the average
power consumption $E[P(t)]$ in the system always remains the
same.

To see this more clearly, consider the subsystem that includes
only the power demands under service currently. The arrival
rate at the subsystem is $\lambda E[P]$, and the time spent by a customer
in the subsystem is $1/s$ regardless of the control policy. By
using Little's theorem, we get that the average number of
customers in the subsystem (which also denotes the average
power consumption) is fixed, $E[P(t)] = \lambda E[P]/s$, and the
proof is completed. ■
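
To get a feel for the size of the gap in (12), consider as an illustrative assumption (not a choice made in the analysis above) a quadratic cost $C(x) = x^2$ with unit power requirements. Under the default policy the number of active demands $N(t)$ of the M/M/$\infty$ system is Poisson distributed with mean $\lambda/s$ at steady state, so that

```latex
\begin{align*}
E[C(N(t))] &= E[N(t)^2]
            = \operatorname{Var}[N(t)] + \bigl(E[N(t)]\bigr)^2
            = \frac{\lambda}{s} + \left(\frac{\lambda}{s}\right)^2 , \\
C\bigl(E[N(t)]\bigr) &= \left(\frac{\lambda}{s}\right)^2 .
\end{align*}
```

For this particular cost the gap between the default policy and the lower bound equals the variance $\lambda/s$ of the power consumption, which is the quantity a scheduling policy can try to reduce by smoothing the load.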

As will be shown in the sequel, this bound is asymptotically 
tight as the deadlines become larger. In other words, we will 
show that there exists a policy that is asymptotically optimal 
and achieves the bound. 

C. An Asymptotically Optimal Policy: Controlled Release 

Without loss of generality, assume unit power requirements,
$p_n = 1$. Consider the following threshold-based control policy.
There exists a threshold $P_0$. Upon arrival of a new request at
time $t$, the controller checks current power consumption $P(t)$.
If $P(t) < P_0$, the demand request is activated; otherwise it
is queued. Queued demands are activated either when their
deadline expires or when the power consumption $P(t)$ drops
below $P_0$. We refer to this policy as the Controlled Release
(CR) policy.
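
The policy just described can be explored with a small event-driven simulation. The sketch below is only an illustration under stated assumptions: exponential inter-arrival times (rate `lam`), exponential durations (rate `s`) and exponential deadlines (rate `d`), unit power per task, and an integer threshold `p0`. The function name `simulate_cr` and all parameter values are hypothetical.

```python
import random
from typing import Callable, Tuple


def simulate_cr(lam: float, s: float, d: float, p0: int,
                cost: Callable[[int], float],
                horizon: float, seed: int = 0) -> Tuple[float, float]:
    """Time-averaged (power, cost) under the CR policy, unit power per task."""
    rng = random.Random(seed)
    t, p, q = 0.0, 0, 0            # time, active demands P(t), queued demands
    power_area = cost_area = 0.0
    while True:
        r_arr, r_done, r_exp = lam, p * s, q * d
        total = r_arr + r_done + r_exp
        dt = rng.expovariate(total)          # time until the next event
        step = min(dt, horizon - t)
        power_area += p * step
        cost_area += cost(p) * step
        if dt >= horizon - t:                # horizon reached before the event
            break
        t += dt
        u = rng.random() * total             # pick which event occurred
        if u < r_arr:                        # new request arrives
            if p < p0:
                p += 1                       # activate immediately
            else:
                q += 1                       # queue the request
        elif u < r_arr + r_done:             # an active demand completes
            p -= 1
            if q > 0 and p < p0:             # consumption dropped below P_0
                q -= 1
                p += 1
        else:                                # deadline of a queued demand expires
            q -= 1
            p += 1
    return power_area / horizon, cost_area / horizon


avg_p, avg_c = simulate_cr(lam=10.0, s=1.0, d=0.1, p0=11,
                           cost=lambda x: x * x, horizon=5000.0)
print(avg_p, avg_c)
```

Over a long horizon the time-averaged power should stay close to $\lambda/s$ whatever threshold is used, while the time-averaged cost depends on the threshold and on the deadline rate `d`.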

[Figure 4 not reproduced; recoverable labels: control u(t); u(t) = 1 (activate immediately); rate P(t)s (demand completion); active power demands.]

Fig. 4. Depiction of optimal threshold policies. The Threshold Postponement
(TP) policy $\pi_b$ is depicted above. The Enhanced Threshold Postponement
(ETP) policy $\pi_e$ follows the rationale depicted above, with rate $Q(t)d$
substituted by $Q(t)d + P(t)s \cdot 1[P(t) < P_e]$. The control $u(t) \in \{0,1\}$
is applied based on corresponding thresholds $P_b$, $P_e$ on power consumption
$P(t)$.



Theorem 3: The CR policy is asymptotically optimal in the
sense that for optimized threshold, as the deadlines increase,
its performance converges to the lower bound of all policies,
$C(E[P(t)]) = C(\lambda/s)$.

Proof: We provide a sketch of the proof. Consider an
auxiliary system, $S_{aux}$, that is like the one described in the CR
policy above, except that there are no deadline considerations.
That is, in $S_{aux}$, upon arrival of a demand request at time $t$, the
controller checks power consumption $P(t)$. If $P(t) < P_0$, the
demand request is activated; otherwise it is queued. Queued
demands are activated when the power consumption drops
below $P_0$.

Clearly, in the auxiliary system, requests are queued only
when the upper bound $P_0$ on power consumption is exceeded.
Essentially $S_{aux}$ is equivalent to an M/M/c queueing system,
with $c = P_0$ "servers" lETl Section 3.4]. From Little's theorem,
the average number of power demands in the system is $\lambda(\frac{1}{s} +
W)$, where $W$ is the average waiting time of a request in the
queue until it gets activated. Define the occupation rate per
server, $\rho = \lambda/(cs)$. The average number of power demands in
the system is written as $c\rho + \lambda W$. Note that term $c\rho$ denotes
the expected number of busy servers at steady-state.

Now define a sequence of thresholds $P_0^n = \frac{\lambda}{s} + \epsilon_n$, $n =
1, 2, \ldots$, where $\epsilon_n$ is chosen so that $\lim_{n \to \infty} \epsilon_n = 0$. Note that
a sequence of occupation ratios $\rho_n$, $n = 1, 2, \ldots$, accordingly
emerges, with $\rho_n = \lambda/(c_n s) = \lambda/(P_0^n s)$, and

$$\lim_{n \to \infty} \rho_n = \lim_{n \to \infty} \frac{\lambda}{s\left(\frac{\lambda}{s} + \epsilon_n\right)} = 1 , \qquad (13)$$

and therefore in the limit, the number of busy servers is
constant, $\lambda/s$, with probability 1. This implies that (12) holds
with equality and therefore the expected cost for the auxiliary
system is $C(\lambda/s)$, which is precisely the universal lower bound
derived above.
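
The quantities used in this argument can be computed for the auxiliary M/M/c system with the classical Erlang C formula. The sketch below is an illustration only; the function name `erlang_c_wait` and the numbers are hypothetical, and the threshold is restricted to integer values so that it can play the role of the number of servers $c$.

```python
import math


def erlang_c_wait(lam: float, s: float, c: int) -> float:
    """Mean queueing delay W in an M/M/c system (Erlang C formula).

    lam: arrival rate, s: service rate per server, c: number of servers.
    Requires rho = lam / (c * s) < 1 for stability.
    """
    a = lam / s                        # offered load
    rho = a / c                        # occupation rate per server
    if rho >= 1.0:
        raise ValueError("unstable system: need lam < c * s")
    head = sum(a ** k / math.factorial(k) for k in range(c))
    tail = a ** c / math.factorial(c) / (1.0 - rho)
    p_wait = tail / (head + tail)      # probability that an arrival must queue
    return p_wait / (c * s - lam)


lam, s = 10.0, 1.0
for c in (20, 15, 12, 11):             # thresholds approaching lam / s = 10
    w = erlang_c_wait(lam, s, c)
    # columns: c, occupation rate rho, mean number in system c*rho + lam*W
    print(c, round(lam / (c * s), 3), round(lam * (1.0 / s + w), 3))
```

As the threshold approaches $\lambda/s$ the occupation rate tends to 1 while the mean waiting time $W$ grows, which is consistent with the result requiring deadlines to be long.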

Consider now the original system with the CR policy.
Queued requests are activated either when power consumption
drops below $P_0$ or when the deadlines expire. The latter occurs
with average deadline expiration rate $d$. As average deadline
durations $1/d$ increase, the deadline expiration rate $d$ goes to
0, and the original system tends to behave like the auxiliary
system $S_{aux}$. Since the performance of the CR policy converges
to that of the auxiliary system as the deadlines increase, and
the performance of the auxiliary system asymptotically achieves
the lower bound above, it follows that the CR policy is also
asymptotically optimal as deadlines increase. ■

D. Optimal Threshold Based Control Policies 

In this section we describe two power demand control 
policies that rely on instantaneous power consumption to make 
their decisions, yet their associated control spaces differ. We 
omit the proofs of optimality due to space limitations. 

1) Bi-modal control space: First, we consider the class of
bi-modal control policies for which the control space for each
power demand task $n$ is bi-modal, namely $U_b = \{0, D_n\}$. That
is, each demand $n$ is either scheduled immediately upon arrival,
or it is postponed to the end, such that it is completed precisely
at the time when its deadline expires. Without loss of generality,
we assume that power requirements are fixed, $p_n = 1$.

We consider the following threshold policy $\pi_b$. At the time
of power demand request arrival $t$, the controller makes the
decision whether the demand will be served immediately or
at the end of its deadline. If the total instantaneous power
consumption $P(t)$ is less than a threshold $P_b$, the controller
serves the power demand request immediately. Otherwise, if
$P(t) \ge P_b$, it postpones the newly generated request to the end
of its deadline. We call this policy the Threshold Postponement
(TP) policy.

The system state at time $t$ is described by the pair of nonnegative
integers $(P(t), Q(t))$, where $P(t)$ is the number of demands
that consume power at time $t$ and $Q(t)$ is the number of
postponed demands. Observe that there is an additional source
of demand requests that enter power consumption with rate
$Q(t)d$, where $d$ is the rate of deadline expiration.

Assuming that demand durations and deadlines are exponential
and homogeneous and demand power level is fixed,
$(P(t), Q(t))$ is a controlled continuous time Markov chain.
Define the control function $u(t) = 1$ if newly arrived demands
are activated immediately, and $u(t) = 0$ if they are postponed
until their deadlines expire. The transitions that describe the
continuous time evolution of the Markov chain are as follows.
From state $(P(t), Q(t))$, there is transition to the state:

• $(P(t)+1, Q(t))$ with rate $\lambda u(t)$, which occurs when a
new arriving demand is activated immediately.

• $(P(t), Q(t)+1)$ with rate $\lambda(1 - u(t))$, when a new demand
is postponed and joins the queue of postponed demands.

• $(P(t)-1, Q(t))$ with rate $P(t)s$, due to completion of
active demands.

• $(P(t)+1, Q(t)-1)$ with rate $Q(t)d$, due to expiration of
deadlines of postponed demands.

When $P(t) < P_b$ then $u(t) = 1$. Then, $P(t)$ varies with rate
$\lambda + Q(t)d - P(t)s$ due to new requests, expirations of deadlines
of postponed requests and completions of active demands. In
the same case, $Q(t)$ decreases with rate $Q(t)d$. On the other
hand, when $P(t) \ge P_b$ then $u(t) = 0$; $P(t)$ varies with rate
$Q(t)d - P(t)s$, while $Q(t)$ varies with rate $\lambda - Q(t)d$. The
rationale of the TP policy is shown in Fig. 4.
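
The transition structure of the TP policy can be written down directly, which also makes it easy to check the net rates quoted above. The following is a minimal sketch assuming a constant integer threshold `p_b`; the names `tp_transitions` and `drift` are hypothetical helpers introduced only for this illustration.

```python
from typing import Dict, Tuple

State = Tuple[int, int]


def tp_transitions(p: int, q: int, lam: float, s: float, d: float,
                   p_b: int) -> Dict[State, float]:
    """Transition rates out of state (P, Q) under the TP policy."""
    u = 1 if p < p_b else 0               # control: 1 = activate, 0 = postpone
    rates: Dict[State, float] = {}
    if u == 1:
        rates[(p + 1, q)] = lam           # arrival, activated immediately
    else:
        rates[(p, q + 1)] = lam           # arrival, postponed
    if p > 0:
        rates[(p - 1, q)] = p * s         # completion of an active demand
    if q > 0:
        rates[(p + 1, q - 1)] = q * d     # deadline of a postponed demand expires
    return rates


def drift(p: int, q: int, rates: Dict[State, float]) -> Tuple[float, float]:
    """Net rate of change of P and Q implied by the transition rates."""
    dp = sum(r * (p2 - p) for (p2, _), r in rates.items())
    dq = sum(r * (q2 - q) for (_, q2), r in rates.items())
    return dp, dq


# Hypothetical numbers: lam = 4, s = 1, d = 0.5, threshold P_b = 5.
print(drift(3, 2, tp_transitions(3, 2, 4.0, 1.0, 0.5, 5)))  # below threshold
print(drift(6, 2, tp_transitions(6, 2, 4.0, 1.0, 0.5, 5)))  # above threshold
```

A switching curve that depends on the queue length could be accommodated by evaluating the threshold at the current value of `q` before calling `tp_transitions`.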

Theorem 4: The policy that minimizes $E[C(P(t))]$ over all
bi-modal control policies with control space $U_b$ is of threshold
type, where the threshold is a switching curve $P_b(Q)$ that
is non-decreasing in terms of $Q$. For appropriately selected
switching curve $P_b(Q)$, the TP policy above is optimal.

The proof is based on showing that the infinite horizon
discounted-cost problem

$$\min \lim_{T \to \infty} \int_0^T \beta^t C(P(t)) \, dt \qquad (14)$$

with discount factor $\beta < 1$ admits a stationary optimal control
policy. The long-run average cost problem is then treated as a
limiting case of the discounted-cost problem as $\beta \to 1$ and has
a stationary policy as well El . 

Some intuition on the form of the switching curve could be
obtained as follows. There must exist a value $P_b(Q)$ of $P(t)$
beyond which serving a demand in the future is more likely to
induce lower cost than serving it immediately at the current
cost. From the transition rates above, observe
that the likelihood of reducing $P(t)$ increases with increasing
$P(t)$. Furthermore, the likelihood of reducing $P(t)$ goes down
with increasing $Q(t)$, which means that an increase of $P(t)$
becomes more likely with increasing $Q(t)$. This seems to imply that
threshold $P_b(Q)$ is a non-decreasing function of $Q$.

2) Enhanced control space: Consider now the enhanced
policy $\pi_e$. At the time of power demand request arrival $t$,
the controller makes the decision whether the demand will be
served immediately or at the end of its deadline. If the total
instantaneous power consumption $P(t) < P_e$, the controller
serves the power demand request immediately. Also, in this
case, whenever an active power demand is completed, a postponed
demand from the queued ones is activated. Otherwise, if
$P(t) \ge P_e$, it postpones the newly generated request to the end
of its deadline. Whenever the deadline of the demand expires,
the demand is activated. This policy has the additional degree
of freedom to schedule a demand after it is generated and
before its deadline expires. The control space for this policy
is $U_e = \{[a_n, D_n] \text{ for } n = 1, 2, \ldots\}$, and clearly $U_e \supset U_b$.
We call this policy the Enhanced Threshold Postponement (ETP)
policy.

The system state at time $t$ is again described by $(P(t), Q(t))$,
where $P(t)$ is the number of demands that consume power at
time $t$ and $Q(t)$ is the number of postponed demands. The
control function is again defined to be $u(t) = 1$ if newly
arrived demands are activated immediately, and $u(t) = 0$ if
they are postponed until their deadlines expire. The transitions
from state $(P(t), Q(t))$ in the Markov chain are towards state:

• $(P(t)+1, Q(t))$ with rate $\lambda u(t)$, which occurs when a
new arriving demand is activated immediately.

• $(P(t), Q(t)+1)$ with rate $\lambda(1 - u(t))$, when a new demand
is postponed and joins the queue of postponed demands.

• $(P(t)+1, Q(t)-1)$ with rate $Q(t)d$, due to expiration of
deadlines of postponed demands.

• $(P(t)-1, Q(t))$ with rate $P(t)s(1 - u(t))$, due to completion
of active demands, and no activation of queued demands.

• $(P(t), Q(t)-1)$ with rate $P(t)s\,u(t)$, due to completion
of active demands, and simultaneous activation of queued
demands (that is why $P(t)$ does not change).

When $P(t) < P_e$ then $u(t) = 1$. Then, $P(t)$ increases with
rate $\lambda + Q(t)d$ due to the arriving request rate and the rate
with which deadlines of postponed requests expire. Also $P(t)$
decreases with rate $P(t)s$ due to completion of active demands,
but it also increases with rate $P(t)s$ since queued demands enter
service whenever active ones are completed. When $P(t) < P_e$,
$Q(t)$ decreases with rate $Q(t)d + P(t)s$. On the other hand,
when $P(t) \ge P_e$ then $u(t) = 0$; $P(t)$ varies with rate $Q(t)d -
P(t)s$, while $Q(t)$ varies with rate $\lambda - Q(t)d$. The ETP policy
$\pi_e$ follows the rationale depicted in Fig. 4 with rate $Q(t)d$
substituted by $Q(t)d + P(t)s \cdot 1(P(t) < P_e)$, where $1(\cdot)$ is the
indicator function.
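
The rates described for the ETP policy can be collected into net rates of change of the two state components. The block below only restates the transition rates listed above; the symbols $\delta_P$ and $\delta_Q$ are notation introduced here for illustration and denote the net rate of change of $P(t)$ and $Q(t)$ in the current state.

```latex
\begin{align*}
P(t) < P_e \ \ (u(t) = 1): \quad
  \delta_P &= \lambda + Q(t)d - P(t)s + P(t)s = \lambda + Q(t)d , \\
  \delta_Q &= -\bigl(Q(t)d + P(t)s\bigr) , \\
P(t) \ge P_e \ \ (u(t) = 0): \quad
  \delta_P &= Q(t)d - P(t)s , \\
  \delta_Q &= \lambda - Q(t)d .
\end{align*}
```

The expressions for the first case implicitly assume that the queue of postponed demands is nonempty, since otherwise no queued demand is available to be activated upon a completion.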

Theorem 5: The policy that minimizes $E[C(P(t))]$ over all
control policies with control space $U_e$ is of threshold type. For
appropriately selected threshold $P_e$, the ETP policy is optimal.

Here, the threshold $P_e$ does not depend on $Q(t)$, since $P(t)$
is fed with queued demands when a demand is completed, and
therefore it will remain approximately around a fixed threshold
as long as the queue of postponed demands is nonempty, $Q(t) > 0$.

V. Discussion 

In this work, we took a first step towards bringing control
and optimization theory into the context of the smart power grid. We
focused on a scenario where control of consumer appliances 
is fully delegated to the grid operator, and we studied the 
fundamental problem of smoothing the power demand pro- 
file so as to minimize the grid operational cost over some 
time horizon and promote efficient energy management. This 
problem is envisioned to be a central one with smart grid- 
enabled appliances and two-way communications between the 
provider and consumers. First, we studied the off-line version 
of the scheduling problem. The optimal solution was derived 
for elastic demands that allow preemptive scheduling, while 
for inelastic demands that require non-preemptive scheduling 
the problem is NP-Hard. In light of a dynamically evolving 
system and the need for online scheduling decisions, we stud- 
ied long-term expected cost through a stochastic model. Our 
main result is a threshold scheduling policy with a threshold 
on instantaneous power consumption, which is asymptotically 
optimal in the sense of achieving a universal lower performance 
bound as deadlines increase. We have also proposed two control 
instances with different control spaces. In the first one, the 
controller may choose to serve a new demand request upon 
arrival or to postpone it to the end of its deadline. The second 
one has the additional option to activate one of the postponed
demands when an active demand terminates. For both instances,
the optimal policies are threshold-based.

There exist many issues for investigation. For the threshold-based
policies that we described, an elaborate study and
derivation of the structure of the policies and threshold values
would enhance our study. Input from real-life power grid
systems in terms of the operational cost and power demand
statistics would inform the explicit computation of
thresholds.

For the scenario envisioned in this paper, the incorporation 
in the model of different classes of power demand tasks with 
different inherent constraints is of great interest. Different 
classes of tasks, some of which were captured by the current 
formulation, are as follows: 

• Demands that may have fixed power requirement and zero
time tolerance in scheduling, e.g. lights.

• Demands that have fixed power requirements and there
exists some flexibility in scheduling within a certain time
window, e.g. washing machine or dishwasher.

• Demands that have flexibility both on the power demand
and the duration. Some of these may need to be periodically
turned on and off by the operator, like the air
conditioning.

• Special types of demands. For example, in the task of
charging electric vehicles, there exist constraints on the
total amount of energy needed to charge the battery and on
the time interval by which charging needs to be completed.
Charging may take place at nonconsecutive time intervals
and with adaptable charging rate. The latter results in a
flexibility in tuning instantaneous power demand.

In particular, the possibility of controlling the power consumption
level of appliances in addition to time scheduling adds a new
dimension to the problem. Such scenarios have already started
to appear in instances where the consumption level of
consumer A/C is controlled by the operator. The derivation of
optimal control policies in this context is an interesting issue.

In this work, we assumed that the provider has full control
over consumer appliances, and these always comply with the
dictated schedule. Many other scenarios could be envisioned.
For instance, some freedom may be granted to the consumer
to select whether the schedule announced by the provider will
be admitted or not. Some incentives from the provider side
could also be considered in that case, like reduced prices
if users comply with the schedule. If continuous feedback on
instantaneous price per unit of power demand is provided 
by the operator, the user would need to decide whether to 
activate the demand immediately and pay the instantaneous 
price, or postpone the demand for a later time, if such an option 
exists, with the hope that the price becomes lower. Another 
possibility in that case could be that each consumer makes 
its proposition to the provider in terms of defining its time 
flexibility in scheduling according to the announced price. Each 
of the scenarios above gives rise to interesting mathematical 
models of interaction that warrant investigation. 